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SHORT COMMUNICATION 

Bringing spiders to the multilocus era: novel anonymous nuclear markers for Harpactocrates 
ground-dwelling spiders (Araneae: Dysderidae) with application to related genera 
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Abstract. Multilocus approaches are essential for accurately recovering the evolutionary processes underlying species and 
population history. However, historical inferences in non-model organisms are still almost exclusively based on 
mitochondrial DNA due to the difficulty of obtaining multiple informative loci. Here, we use a genomic library based 
approach, to generate 15 novel anonymous nuclear markers (ANMs) from the ground spider Harpactocrates globifer 
Ferrandez 1986. The ANMs cross-amplify and sequence well in the target species and its two closest relatives, and some of 
them also in the more distantly related H. rovastelhis Simon 1914. Levels of nucleotide diversity of the ANMs within H. 
globifer ranged from 0.05% to 1.4% and average sequence divergence between close congeneric relatives from 0.02% to 
13.9%, supporting the utility of these loci in population and species-level analyses. Moreover, a cross-species amplification 
screened in other spider taxa showed that some of the loci could also potentially be useful in more distantly related genera. 
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Multilocus approaches based on unlinked markers, including both 
mitochondrial (mtDNA) and nuclear markers, are essential for an 
accurate reconstruction of the evolutionary processes underlying 
species and population history. Inferences based exclusively on 
mtDNA markers are hampered by processes such as sex-biased 
behaviors (e.g., sex-biased dispersal), low dispersal ability and small 
population sizes, incomplete lineage sorting, introgression or selective 
sweeps (Avise 2000; Hare 2001; Irwin 2002; Ballard & Whitlock 2004; 
Wilkins 2004; Kuo & Avise 2005). In recent years the realization that 
gene trees and species trees can differ markedly in closely related 
species has sparked the development of a new generation of methods 
for species tree inference, which rely strongly on the information 
provided by multiple markers (Maddison 1997; Edwards 2009). 
Moreover, many studies have shown that the statistical power of 
parameters such as divergence times, population size and rates of gene 
flow, rely on the number of loci used (see Brumfield et al. 2003; Brito 
& Edwards 2009 and references therein). Recent advances, both 
technological (see reviews of Hudson 2007; Thomson et al. 2010; 
Ekblom & Galindo 2011) and methodological (e.g., Hey & Nielsen 
2007; Heled & Drummond 2010; Hey 2010; Yang & Rannala 2010), 
have greatly facilitated the implementation of multilocus analyses. 
Unfortunately, the scarcity of available genomic information makes 
finding multiple informative loci in non-model organisms a daunting 
task (Thomson et al. 2010). This is especially true for spiders, in which 
just a handful of mitochondrial and nuclear markers that are variable 
at the population and species level can be reliably amplified and 
sequenced (see e.g., Hedin 1997 for nadl; Maddison & Hedin 2003 for 
nadl, 16S and EF-1 alpha; Ayoub et al. 2007 for EF-1 gamma; 
Bidegaray-Batista et al. 2007 for coxl and 16S-nadl fragment; Bond 
& Stockman 2008 for 12S-16S fragment and ITS-1 and ITS-2; Vink et 
al. 2008 for Actin 5C; Garb & Gillespie 2009 for coxl, 16S-nadl 
fragment and EF-1 alpha; Muster et al. 2009 for ITS-1, 5.8S rDNA 
and ITS-2; De Busschere et al. 2010 for coxl and 28S). 

Here, we report on a method to circumvent this limitation by 
isolating novel anonymous nuclear markers (ANMs) from a spider 
genomic library. Our target organisms are species belonging to the 
ground-dwelling genus Harpactocrates Simon 1914 (family Dysder¬ 
idae) which inhabit the mountain ranges of the Sistema Central in the 


Iberian Peninsula. From west to east, the species Harpactocrates 
gredensis Ferrandez 1986, H. globifer Ferrandez 1986 and H. gurdus 
Simon 1914 are found along the range, showing narrow, almost non¬ 
overlapping distributions. As with other species (about 13) in the 
genus, they are generally found at high elevation (above 1000 m) in 
temperate and moist forests, suggesting a preference for cool and 
humid environments. Preliminary phylogenetic and phylogeographic 
studies have shown that Pleistocene glaciations had a major role in 
structuring populations in Harpactocrates. These observations have 
led us to hypothesize that Harpactocrates species in the Sistema 
Central underwent population range expansion toward lower 
elevations during cooler periods, facilitating gene flow, whereas at 
interglacial periods they retreated to high elevation refuges, which 
then led to population fragmentation (e.g., Masta 2000; Knowles 
2001; DeChaine & Martin 2005; Carstens & Knowles 2007b; 
Mardulyn et al. 2009; Schmitt 2009). With the aim of testing this 
hypothesis using a multilocus species/population approach, we have 
developed 15 novel ANMs from a genomic library of H. globifer, 
which cross-amplify and sequence well in the closely related species H. 
gurdus and H. gredensis and some of them in a more distant species H. 
ravastellus Simon 1914. Additionally, we have screened cross-species 
amplification in other spider taxa in order to evaluate the range and 
potential application of the novel markers (see below). 

We constructed a genomic library from the pooled DNA of two 
H. globifer specimens. Total genomic DNA was extracted using 
SpeedTools Tissue DNA extraction kit (Biotools) and concentrated 
via ethanol precipitations. Genomic DNA (~10 pg) was digested with 
EcoRI restriction enzyme (50 pL total volume). The enzyme was 
selected following the recommendations of Carstens & Knowles 
(2006). The goal was to ensure that the enzyme did not cut the 
mitochondrial DNA into fragments smaller than 1.6 kilobase pairs 
(kb), thereby reducing the likelihood of cloning mtDNA fragments. 
EcoRI was selected based on the information provided by a complete 
mitochondrial genome of a new species of Parachtes Alicata 1964 
(pending formal description, Pons, Bidegaray-Batista & Arnedo 
unpublished data), which is the sister genus to Harpactocrates. 
Digested DNA was visualized on a 1% agarose gel stained with 
ethidium bromide, and fragments between 0.8 and 1.5 kb in length 


506 



BIDEGARAY-BATISTA ET AL.—ANMS FOR HARPACTOCRATES 


507 


Table 1.—Primer sequences for fifteen anonymous nuclear loci developed from a genomic library of Harpatocrates globifer. Names indicate 
the loci, forward and reverse primers, PCR annealing temperature T ( C), size of amplicon and GenBank Accession No. of the original cloned 
fragment. Performance codes indicate successful amplification in the following species: H. globifer (1), H. gurdus (2), H. gredensis (3) and H. 
ravastellus (4), Parachtes romandiolae (a), Dysdera erythrina (b), Harpactea corticalis (c), Loxosceles rufescens (d), Troglohyphantes lucifuga and 
T. pedemontanus (e), Nemesia randa and Iberesia brauni (f). PCR produets resulting in double bands (db) or multiple bands (mb) are indicated 
as superscripts. 


Locus 

Primer sequences (5'-3') 

T (° C) 

Amplicon size (bp) 

Accession no. 

Performance 

qgLi 

F: AGACAGCATTCAGAGTCCAAGCG 

R2: GCCGAAATAGTTTGAGCTCGTTTGCG 

56 

416 

JN654497 

1,2,3,4, a mb 

qgL5 

F: TGCCCACGCCCCACTAAAATAGG 

R: GCCAGGTTGCCAGTTAAAATCACG 

58 

424 

JN654498 

1,2,3,4, a, b 

qgLIO 

F: AGCGACACATCCTTACCTGCGT 

R: GCGCATCTGGAGAGCCTTTGA 

58 

621 

JN654506 

1,2,3,4, a mb , e 

qgL12 

F: TGGCACAGCAGTGGCCAGAA 

R: CATGTCAACCGAATAGAATC 

55 

617 

JN654509 

1,2,3 

qgL21 

F: ACGCCCGAACCGACCTTTGC 

R: ACGAGGGAGGTGCTAGAAGCG 

58 

580 

JN654499 

1,2,3, a db , f mb 

qgL22 

F: ACGATGCACCCTCGAAATGGTCG 

R: TTGGCGCGGAACCTCTCAGC 

58 

508 

JN654511 

1,2,3,4, a db 

qgL25 

F: TGCCCACGCCACCCCTATCC 

R:AGGCCAACGCGAAAAGTCAGC 

59 

309 

JN654510 

1,2,3,4, a, b, d 

qgL26 

F: TCCCCGGGTCACGTGGGAAG 

R: TCCCCAACGTGAACGAACCGC 

57 

586 

JN654500 

1,2,3,4, a mb 

qgL28 

F: GTCCCGTCGTCCGGGGTTTG 

R: GCCACCCATGCTTTTTGTGCTCC 

59 

401 

JN654508 

1,2,3,4, a 

qgL29 

F: TGGACTCCCGTTTCACAAGGCG 

R: CCACGCTATAATTGGCCCACAAGC 

53 

459 

JN654502 

1,2,3, a mb 

qgL32 

F: AGCCCCAACATCCGTTTGACC 

R: CGTCGGCAAAAGGGACCACCC 

58 

567 

JN654503 

1,2,3,4, a mb 

qgL33 

F: ACGTGCCACACTCTCTCTCTTTCG 

R: TGCCTGCTCGAATCAAACCCAGC 

58 

343 

JN654504 

1,2,3 

qgL34 

F: CGCGTCACCATAGGCTCTTCGC 

R: AGTTCGGTGTGTGGCCGAGC 

58 

408 

JN654507 

1,2,3, a mb , b db 

qgL36 

F: AGCGTAACGTAGGGCGTGCAG 

R: CGGCCGCTTGGAAGGTGGTC 

58 

627 

JN654501 

1,2,3,4, a mb , b mb , f db 

qgL38 

F: TCACAGGCATACGAAAGCGCC 

R: GCCGAGCGTAGGCCTAGCATAAC 

58 

454 

JN654505 

1,2,3,4, a 


were excised and purified from the gel using the kit QIAquick Gel 
Extraction kit (QIAGEN). Excised fragments were concentrated via 
ethanol precipitations to 8 ng/pL and cloned using the CloneJET™ 
PCR Cloning kit (Fermentas) and One Shot TOPIO competent cells 
(Invitrogen). Transformed cells were plated on agar plates containing 
ampicillin. Five hundred colonies were individually picked with a 
pipette tip and added directly to a 25 pL PCR mix containing the 
primers included in the cloning kit. Amplified inserts were visualized 
on a 1% agarose gel stained with ethidium bromide. Of the 500 
amplified inserts, 206 that ranged in size from 300 to 1200 bp were 
selected for sequencing. Forty-one sequences were discarded because 
they were either identical to other sequences or did not sequence well. 
i The remaining 165 sequences were locally blasted against the 
Parachtes sp. mitochondrial genome to ensure that there were no 
mtDNA sequences among the selected inserts, and in all cases they 
turned out to be part of the nuclear genome. Subsequently, we 
performed nucleotide BLAST searches against the GenBank database 
! to characterize sequences. Most sequences (160 out of 165) reported 
non-significant BLAST hits. Thirty-eight of these sequences were 
j subsequently selected for primer design based on features that ensured 
that they were non-coding regions, such as the presence of multiple 
stop codons in each of the six reading frames, and avoiding poly-A or 
-G runs, which are difficult to sequence. The primer pairs designed 
were initially screened in a set of individuals consisting of one from 
each of the three target species (H. globifer , H. gurdus and H. 
gredensis) and one belonging to a more distantly related species (//. 


ravastellus from the Pyrenees). Each primer pair was tested in a 
standard 25 pL PCR mix with the following conditions: 3 min at 94° 
C followed by 30 cycles of denaturation at 94 C for 30 s, annealing at 
three different temperatures of 52° C, 55° C and 58° C for 35 s, and 
extension at 72° C for 1 min, with a final single extension step of 72° C 
for 10 min. Of the 38 previously selected loci, 15 produced a single 
band product for at least one of the annealing temperatures tested 
and sequenced well in the three target species (Table 1). The 
annealing temperature of the PCR reaction was subsequently 
optimized (see Table 1), and the extension time was adjusted 
according to the length of the fragment. 

Eight to 10 individuals of H. globifer from different populations 
were sequenced for each locus (Table 2) to investigate how 
informative the markers might be for population and phylogeo- 
graphic studies. In addition, diversity indices and population statistics 
of the anonymous makers were compared with those of known 
mitochondrial genes and nuelear introns. With this aim, a mitochon¬ 
drial fragment ( 16S-nadl ) spanning the 3' half of the 16S rRNA 
ribosomal subunit ( rrnL ), the complete tRNA leu ( trnLl ), the 5' half 
of the NADH dehydrogenase subunit I ( nadl ) and the nuclear intron 
of the signal recognition particle 54-kDA subunit ( srp54 ) were 
amplified and sequenced for 10 individuals of H. globifer and a single 
individual of H. gurdus , H. gredensis and H. ravastellus , respectively. 
The 16S-nadl fragment was amplified with primer pairs LR-N-13398 
(Simon et al. 1994) and Nl-J-12350 (5'-CCTARTTGRCTAR- 
ARTTRGCRSATCARCCAATTG -3') and the srp54 with SRP54fl 
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Table 2.—Overall diversity measures and statistics estimated across 15 anonymous loci, srp54 intron and 16s-nadl mitochondrial fragment for 
Harpactocrates globifer. Sample size (SS) indicates the number of individuals sequenced for each locus, in brackets the number of individuals 
used in the estimations considering the 0.9 probability threshold in PHASE. The length (L) in bp for each locus after sequences end-trimming and 
excluding sites with gaps. The lengths of indels are indicated as they occurred in each locus (N/A: not applicable, if indels were not observed). The 
number of segregating sites (S) and haplotypes (H), nucleotide diversity (rc), minimum number of recombination events (R M ) of Hudson (1985), 
linkage disequilibrium statistic (ZZ) of Rozas et al. (2001), Tajima’s D test (D) of Tajima (1989), and the HKA test of Hudson (1987). Not 
significant (ns) and significant (*) values at P < 0.05 of statistics after coalescence simulations. 


H. globifer HKA Test (H. globifer! 

//. gredensis) srp54 vs. 


Locus 

SS 

L 

Indel length 

S 

H 

K 

Rm 

ZZ 

D 

other locus 

qgLl 

10 (10) 

313 

N/A 

8 

3 

0.0031 

0 

0.0000 

-1.9072 * 

P = 0.4764 ns 

qgL5 

10 (10) 

342 

N/A 

5 

4 

0.0039 

2 ns 

0.0477 ns 

-0.1917 ns 

P = 0.5292 ns 

qgLIO 

9(6) 

540 

1, 8 

12 

7 

0.0057 

0 

0.1707 ns 

-0.9653 ns 

P = 0.7549 ns 

qgLl 2 

8(8) 

542 

L 2 

16 

7 

0.0071 

0 

0.2002 ns 

-0.8083 ns 

P = 0.2500 ns 

qgL21 

10(9) 

496 

L 2 

15 

10 

0.0088 

1 ns 

0.1057 ns 

-0.0070 ns 

P = 0.7649 ns 

qgL22 

10(10) 

429 

3, 1, 2 

16 

5 

0.0140 

0 

0.0360 ns 

1.2410 ns 

P = 0.4838 ns 

qgL25 

10(9) 

223 

6, 1 

1 

2 

0.0009 

0 

0.1170 ns 

-0.5290 ns 

P = 0.2839 ns 

qgL26 

10(8) 

512 

N/A 

2 

3 

0.0005 

0 

0.0000 

-1.4979 * 

P = 0.3564 ns 

qgL28 

10(10) 

317 

N/A 

8 

6 

0.0068 

0 

- 0.0570 ns 

-0.1529 ns 

P = 0.4341 ns 

qgL29 

10(9) 

373 

N/A 

12 

6 

0.0070 

1 ns 

0.1465 ns 

-0.9463 ns 

P = 0.9255 ns 

qgL32 

10(9) 

490 

5, 11, 3, 10 

21 

5 

0.0119 

0 

0.0665 ns 

-0.1766 ns 

P = 0.5201 ns 

qgL33 

10(10) 

266 

N/A 

12 

7 

0.0142 

1 ns 

-0.0074 ns 

0.4320 ns 

P = 0.5290 ns 

qgL34 

10(10) 

319 

1 

13 

8 

0.0108 

0 

0.0678 ns 

-0.4578 ns 

P = 0.8684 ns 

qgL36 

10(8) 

552 

N/A 

11 

6 

0.0057 

0 

-0.0444 ns 

-0.1786 ns 

P = 0.5850 ns 

qgL38 

10(9) 

366 

N/A 

5 

3 

0.0067 

0 

0.2154 ns 

2.1431 ns 

P = 0.3583 ns 

srp54 

10(10) 

119 

N/A 

7 

8 

0.0165 

1 ns 

-0.0154 ns 

-0.0165 ns 

- 

16S-nadl 

10 (10) 

851 

1 

76 

9 

0.0237 

7 ns 

-0.0103 ns 

-1.5465 ns 

P = 0.1021 ns 


and SRP52rl (Jarman et al. 2002). PCR conditions were as follows: 
94° C for 2 min; 35 X (94 C for 35 s; 45° C for 45 s for 16S-nodl and 
from 45 C to 50 C for 35 s for srp54\ 72 C for 45 s for I6S-nadl and 
35 s for srp54; 72 C for 5 min). 

The allelic phase of heterozygotic individuals was resolved using the 
algorithms provided by PHASE (Stephens et al. 2001; Stephens & 
Donnelly 2003) and implemented in DnaSP v5 (Librado & Rozas 
2009). In the case where direct sequencing detected multiple copies of 
different lengths, either as a result of heterozygote individuals or 
paralogy, the PCR products were cloned and four to eight colonies 
sequenced. Cloning results were compatible with alleles of different 
length (i.e., high sequence similarity) rather than paralogous copies, 
assuming that polymorphisms due to singleton mutations among 
sequenced colonies represented errors of the Taq polymerase enzyme 
or cloning artifacts (see Villablanca et al. 1998; Calderon et al. 2009; 
Cummings et al. 2010; Dawson et al. 2010). 

The taxonomic range of applicability of the novel markers was 
evaluated by screening a sample of specimens belonging to groups at 
different hierarchical levels: Parachtes romandiolae Caporiacco 1949 
and Dysdera erytbrina Walckenaer 1802 (Araneomorphae, Haplogy- 
nae, Dysderidae, Dysderinae), Harpactea corticalis Simon 1882 
(Araneomorphae, Haplogynae, Dysderidae, Harpacteinae), Loxosce- 
les rufeseens Dufour 1820 (Araneomorphae, Haplogynae, Sicariidae), 
Troglohyphantes lucifugo Simon 1884 and T. pedewontanus Gozo 
1908 (Araneomorphae, Entelegynae, Linyphiidae), and Nemesici 
rcmda Decae 2005 and Iberesia brcnmi L. Koch 1882 (Mygalomor- 
phae, Nemesiidae). Two independent PCR amplifications were 
performed for each locus using the following touchdown cycle: 94 
C 2 min; 10X [94 C for 35 s, 63 C for 35 s (-0.5 C/cycle), 72 C for 
1 min]; 10X [94 C for 35 s, 58 C for 35 s, 72 C for 1 min]; 15X [94 
C for 35 s, 52 C for 35 s, 72° C for 1 min]; 72 C for 10 min. Results 
of the amplification success are summarized in Table 1. 

The average interspecific sequence divergence (p distance) of the 
anonymous loci calculated with DnaSP v5 ranged from 0.02% 
(qgL26) to 8.1% (qgL28) between H. globifer and H. gurdus, 0.41% 
(qgL26) to 6.9% (qgL33) between II. globifer and H. gredensis , 1.8% 


(qgL25) to 11.1% (qgLl) between II. globifer and II. ravastellus, 
0.39% (qgL26) to 8.5% (qgL28) between H. gurdus and H. gredensis, 
1.3% (qgL25) to 13.9% (qgL28) between H. gurdus and H. ravastellus, 
and 2.4% (qgL25) to 9.4% (qgLl) between H. gredensis and H. 
ravastellus. Percent nucleotide diversity within H. globifer ranged 
from 0.05% (qgL26) to 1.4% (qgL33). Genetic diversity indices, 
recombination and linkage disequilibrium measures and neutrality 
tests were calculated for each locus in H. globifer using DnaSP v5. 
Results are summarized in Table 2. We applied the HKA test of 
Hudson (1987) to assess whether levels of polymorphism and 
divergence were correlated, as predicted by neutral theory. The test 
was performed by comparing polymorphism in H. globifer and 
divergence between H. globifer and H. gredensis in the intron srp54 
versus each of the 15 anonymous loci and versus the mitochondrial 
fragment 16s-nadl. Neither recombination nor linkage disequilibrium 
were found at any locus. None of the HKA tests showed deviation 
from neutral predictions, although Tajima’s D test showed significant 
negative values for the loci qgLl and qgL26. The mitochondrial 
fragment l6S-nadl was 1.4 times more variable than the intron srp54, 
and from 1.7 to 47 times more variable than the anonymous loci I 
qgL33 and qgL26, which are the most and least variable ANM’s, I 
respectively. 

This study reports the first ANMs ever designed specifically for 
spiders. These kinds of markers, however, have been isolated and 
extensively applied in other organisms; e.g., grasshoppers (Carstens & 
Knowles 2006), halfbeaks and killifishes (De Bruyn et al. 2010a, b), 
oysters (Hare et al. 1996), birds (Lee & Edwards 2008) and lizards 
(Rosenblum et al. 2007; Noonan & Yoder 2009). ANMs are 
promising markers for multilocus approaches at the population and 
species level of closely related and recently diverged species, as has 
been shown in recent studies (Carstens & Knowles 2007a; Carstens & 
Richards 2007; Knowles et al. 2007; Lee & Edwards 2008). The novel 
ANMs designed here are potentially useful not only for the target 
species, but also for more distantly related species in the genus (e.g., j 
H. ravastellus), as well as for species in related genera (e.g. Parachtes), I 
or even subfamilies (e.g., Dysdera, see Table 1). However, the 
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applicability of these novel markers is compromised at higher 
taxonomic levels. A significant and negative relationship between 
performance and evolutionary distance from the target species is also 
observed among the popular microsatellite markers (Primmer et al. 
1996). Although the strategy implemented in our study greatly 
facilitates and speeds up the isolation of variable markers for 
populations and species, the advent of the so-called next-generation 
sequencing (NGS) methods promise to revolutionize the discovery of 
novel markers in non-model organism even further. In fact, the 
protocols used here for ANM isolation can be easily coupled with 
NGS methods, simply by replacing the cloning step, to generate 
literally hundreds or thousands of novel markers for spiders. 
Moreover, NGS allows sequencing of lots of markers from multiple 
individuals and populations simultaneously. No doubt NGS methods 
will spawn a whole new era for phylogenetic, phylogeographic and 
population genetic studies in spiders and their allies. 
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